AITopics | polynomial growth

2604.13213

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Control Systems (0.90)

Mali, Ankur, Hall, Lawrence, Williams, Jake, Richards, Gordon

Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning

arXiv.org Artificial Intelligence, Oct-10-2025

Activation functions govern the expressivity and stability of neural networks, yet existing comparisons remain largely heuristic. We propose a rigorous framework for their classification via a nine-dimensional integral signature S_sigma(phi), combining Gaussian propagation statistics (m1, g1, g2, m2, eta), asymptotic slopes (alpha_plus, alpha_minus), and regularity measures (TV(phi'), C(phi)). This taxonomy establishes well-posedness, affine reparameterization laws with bias, and closure under bounded slope variation. Dynamical analysis yields Lyapunov theorems with explicit descent constants and identifies variance stability regions through (m2', g2). From a kernel perspective, we derive dimension-free Hessian bounds and connect smoothness to bounded variation of phi'. Applying the framework, we classify eight standard activations (ReLU, leaky-ReLU, tanh, sigmoid, Swish, GELU, Mish, TeLU), proving sharp distinctions between saturating, linear-growth, and smooth families. Numerical Gauss-Hermite and Monte Carlo validation confirms theoretical predictions. Our framework provides principled design guidance, moving activation choice from trial-and-error to provable stability and kernel conditioning.
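The Gaussian propagation statistics mentioned in this abstract are expectations of the activation under a Gaussian input, and the abstract names Gauss-Hermite quadrature as one validation method. As a minimal sketch, the Python snippet below estimates E[phi(Z)] and E[phi(Z)^2] for Z ~ N(0, 1) with NumPy's `hermegauss`. The choice of these two moments, the node count, and the helper name `gaussian_expectation` are assumptions made for illustration; the abstract does not define m1, g1, g2, m2, or eta, so this is not the paper's signature.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gaussian_expectation(f, n_nodes=64):
    """Approximate E[f(Z)] for Z ~ N(0, 1) with Gauss-Hermite quadrature."""
    # hermegauss uses the weight exp(-x^2 / 2), so dividing by sqrt(2*pi)
    # turns the quadrature weights into standard normal probabilities.
    nodes, weights = hermegauss(n_nodes)
    return float(np.sum(weights * f(nodes)) / np.sqrt(2.0 * np.pi))


def relu(x):
    return np.maximum(x, 0.0)


# Two of the eight activations listed in the abstract, chosen for brevity.
activations = {"relu": relu, "tanh": np.tanh}

# Illustrative statistics only (assumed, not the paper's definitions):
# first and second Gaussian moments of phi(Z).
moments = {
    name: (
        gaussian_expectation(phi),
        gaussian_expectation(lambda x, phi=phi: phi(x) ** 2),
    )
    for name, phi in activations.items()
}

for name, (mean, second) in moments.items():
    print(f"{name}: E[phi(Z)] ~ {mean:.4f}, E[phi(Z)^2] ~ {second:.4f}")
```

Quadrature converges quickly for smooth activations such as tanh and more slowly for ReLU because of the kink at zero, which is one reason a Monte Carlo cross-check can be useful.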

artificial intelligence, deep learning, machine learning, (20 more...)

2510.08456

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.14)
Africa > Mali (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sonoda, Sho, Hashimoto, Yuka, Ishikawa, Isao, Ikeda, Masahiro

Generalization Through Growth: Hidden Dynamics Controls Depth Dependence

arXiv.org Machine Learning, May-22-2025

Recent theory has reduced the depth dependence of generalization bounds from exponential to polynomial and even depth-independent rates, yet these results remain tied to specific architectures and Euclidean inputs. We present a unified framework for arbitrary pseudo-metric spaces in which a depth-$k$ network is the composition of continuous hidden maps $f:\mathcal{X}\to \mathcal{X}$ and an output map $h:\mathcal{X}\to \mathbb{R}$. The resulting bound $O(\sqrt{(\alpha + \log \beta(k))/n})$ isolates the sole depth contribution in $\beta(k)$, the word-ball growth of the semigroup generated by the hidden layers. By Gromov's theorem, polynomial (resp. exponential) growth corresponds to virtually nilpotent (resp. expanding) dynamics, revealing a geometric dichotomy behind existing $O(\sqrt{k})$ (sublinear depth) and $\tilde{O}(1)$ (depth-independent) rates. We further provide covering-number estimates showing that expanding dynamics yield an exponential parameter saving via compositional expressivity. Our results decouple specification from implementation, offering architecture-agnostic and dynamical-systems-aware guarantees applicable to modern deep-learning paradigms such as test-time inference and diffusion models.
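To make the dichotomy in this abstract concrete, the sketch below evaluates the depth-dependent part of the bound, sqrt((alpha + log beta(k)) / n), for two hypothetical growth functions: beta(k) = k^3 (polynomial) and beta(k) = 2^k (exponential). The values of alpha and n, the degree 3, and the base 2 are placeholders chosen for illustration and are not taken from the paper; the O(.) constant is ignored.

```python
import math


def depth_term(alpha, log_beta, n):
    """Evaluate sqrt((alpha + log beta(k)) / n), ignoring the O(.) constant."""
    return math.sqrt((alpha + log_beta) / n)


def log_beta_polynomial(k, degree=3):
    # Hypothetical polynomial growth beta(k) = k**degree, so log beta = degree * log k.
    return degree * math.log(k)


def log_beta_exponential(k, base=2.0):
    # Hypothetical exponential growth beta(k) = base**k, so log beta = k * log base.
    return k * math.log(base)


alpha, n = 10.0, 10_000  # placeholder values, not taken from the paper
depths = [1, 4, 16, 64, 256]
poly = [depth_term(alpha, log_beta_polynomial(k), n) for k in depths]
expo = [depth_term(alpha, log_beta_exponential(k), n) for k in depths]

for k, p, e in zip(depths, poly, expo):
    print(f"k={k:4d}  polynomial: {p:.4f}  exponential: {e:.4f}")
```

Under polynomial growth the depth enters only through log k, which matches the depth-independent rate up to logarithmic factors; under exponential growth log beta(k) is linear in k, which gives the sqrt(k) behaviour.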

artificial intelligence, machine learning, semigroup, (17 more...)

2505.15064

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Frikha, Noufel, Pham, Huyên, Song, Xuanye

Full error analysis of policy gradient learning algorithms for exploratory linear quadratic mean-field control problem in continuous time with common noise

arXiv.org Machine Learning, Aug-5-2024

We consider reinforcement learning (RL) methods for finding optimal policies in linear quadratic (LQ) mean field control (MFC) problems over an infinite horizon in continuous time, with common noise and entropy regularization. We study policy gradient (PG) learning and first demonstrate convergence in a model-based setting by establishing a suitable gradient domination condition. Next, our main contribution is a comprehensive error analysis, where we prove the global linear convergence and sample complexity of the PG algorithm with two-point gradient estimates in a model-free setting with unknown parameters. In this setting, the parameterized optimal policies are learned from samples of the states and population distribution. Finally, we provide numerical evidence supporting the convergence of our implemented algorithms.
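The two-point gradient estimate mentioned in this abstract can be illustrated with a generic zeroth-order sketch: the cost is queried at two symmetric perturbations of the parameter, and the difference is projected onto the perturbation direction. The quadratic `toy_cost`, the step size, and the smoothing radius below are hypothetical stand-ins; this is not the paper's mean-field LQ cost or its algorithm.

```python
import numpy as np


def two_point_gradient(cost, theta, radius, rng):
    """Two-point zeroth-order estimate of the gradient of cost at theta."""
    d = theta.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    diff = cost(theta + radius * u) - cost(theta - radius * u)
    return (d / (2.0 * radius)) * diff * u


def toy_cost(theta):
    # Hypothetical stand-in for a policy cost: a simple quadratic.
    return float(theta @ theta)


rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0, 0.5])
initial_cost = toy_cost(theta)

# Plain gradient descent driven only by cost evaluations (model-free).
for _ in range(200):
    grad_estimate = two_point_gradient(toy_cost, theta, radius=0.05, rng=rng)
    theta = theta - 0.1 * grad_estimate

final_cost = toy_cost(theta)
print(f"cost: {initial_cost:.3f} -> {final_cost:.3e}")
```

The estimator needs only two cost evaluations per step and no model of the dynamics, which is what makes it usable when the parameters are unknown.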

algorithm, convergence, gradient, (16 more...)

2408.02489

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

arXiv.org Artificial Intelligence, May-26-2024

Reinforcement Learning for Jump-Diffusions

Gao, Xuefeng, Li, Lingfei, Zhou, Xun Yu

We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exploration-exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al. (2020), the derivation of the exploratory dynamics under jump-diffusions calls for a careful formulation of the jump part. Through a theoretical analysis, we find that one can simply use the same policy evaluation and q-learning algorithms in Jia and Zhou (2022a, 2023), originally developed for controlled diffusions, without needing to check a priori whether the underlying data come from a pure diffusion or a jump-diffusion. However, we show that the presence of jumps ought to affect parameterizations of actors and critics in general. Finally, we investigate as an application the mean-variance portfolio selection problem with stock price modelled as a jump-diffusion, and show that both RL algorithms and parameterizations are invariant with respect to jumps.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2405.16449

Genre: Research Report (0.50)

Industry:

Banking & Finance > Trading (1.00)
Energy > Oil & Gas > Upstream (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ata, Baris, Harrison, J. Michael, Si, Nian

Drift Control of High-Dimensional RBM: A Computational Method Based on Neural Networks

arXiv.org Artificial Intelligence, Dec-18-2023

Motivated by applications in queueing theory, we consider a stochastic control problem whose state space is the $d$-dimensional positive orthant. The controlled process $Z$ evolves as a reflected Brownian motion whose covariance matrix is exogenously specified, as are its directions of reflection from the orthant's boundary surfaces. A system manager chooses a drift vector $\theta(t)$ at each time $t$ based on the history of $Z$, and the cost rate at time $t$ depends on both $Z(t)$ and $\theta(t)$. In our initial problem formulation, the objective is to minimize expected discounted cost over an infinite planning horizon, after which we treat the corresponding ergodic control problem. Extending earlier work by Han et al. (Proceedings of the National Academy of Sciences, 2018, 8505-8510), we develop and illustrate a simulation-based computational method that relies heavily on deep neural network technology. For test problems studied thus far, our method is accurate to within a fraction of one percent, and is computationally feasible in dimensions up to at least $d=30$.

control problem, equation, polynomial growth, (15 more...)

2309.11651

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Georgia > Chatham County > Savannah (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Machine Learning, Dec-18-2023

Wide Deep Neural Networks with Gaussian Weights are Very Close to Gaussian Processes

Trevisan, Dario

We establish novel rates for the Gaussian approximation of random deep neural networks with Gaussian parameters (weights and biases) and Lipschitz activation functions, in the wide limit. Our bounds apply for the joint output of a network evaluated on any finite input set, provided a certain non-degeneracy condition of the infinite-width covariances holds. We demonstrate that the distance between the network output and the corresponding Gaussian approximation scales inversely with the width of the network, exhibiting faster convergence than the naive heuristic suggested by the central limit theorem. We also apply our bounds to obtain theoretical approximations for the exact Bayesian posterior distribution of the network, when the likelihood is a bounded Lipschitz function of the network output evaluated on a (finite) training set. This includes popular cases such as the Gaussian likelihood, i.e. exponential of minus the mean squared error.

artificial intelligence, machine learning, neural network, (20 more...)

2312.11737

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Lytras, Iosif, Sabanis, Sotirios

Taming under isoperimetry

arXiv.org Machine Learning, Nov-15-2023

In this article we propose a novel taming Langevin-based scheme called $\mathbf{sTULA}$ to sample from distributions with superlinearly growing log-gradient which also satisfy a Log-Sobolev inequality. We derive non-asymptotic convergence bounds in $KL$ and consequently total variation and Wasserstein-$2$ distance from the target measure. Non-asymptotic convergence guarantees are provided for the performance of the new algorithm as an optimizer. Finally, some theoretical results on isoperimetric inequalities for distributions with superlinearly growing gradients are provided. Key findings are a Log-Sobolev inequality with constant independent of the dimension, in the presence of a higher order regularization and a Poincaré inequality with constant independent of temperature and dimension under a novel non-convex theoretical framework.
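The idea of taming can be illustrated with a generic tamed unadjusted Langevin step, in which the gradient is rescaled so that a single update cannot explode when the gradient grows superlinearly. The sketch below uses the standard normalisation g / (1 + step * |g|) on a one-dimensional toy potential U(x) = x^4 / 4. This is an assumed, textbook-style taming and not the sTULA scheme of the article, whose exact update is not given in the abstract; the potential, step size, and burn-in length are also illustrative choices.

```python
import numpy as np


def grad_potential(x):
    # Gradient of the toy potential U(x) = x**4 / 4, which grows superlinearly.
    return x ** 3


def tamed_langevin(x0, step, n_steps, rng):
    """Run a tamed unadjusted Langevin chain and return all iterates."""
    x = float(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        g = grad_potential(x)
        # Taming: step * tamed always has magnitude below 1, however large g is.
        tamed = g / (1.0 + step * abs(g))
        x = x - step * tamed + np.sqrt(2.0 * step) * rng.standard_normal()
        samples[i] = x
    return samples


rng = np.random.default_rng(0)
# Start far from the mode, where an untamed Euler step of size 0.1 would overshoot.
samples = tamed_langevin(x0=10.0, step=0.1, n_steps=5000, rng=rng)
kept = samples[500:]  # discard burn-in
print(f"mean ~ {kept.mean():.3f}, second moment ~ {(kept ** 2).mean():.3f}")
```

With the same step size and starting point, the untamed update x - 0.1 * x^3 jumps from 10 to -90 and then diverges, which is the failure mode taming is designed to avoid.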

artificial intelligence, inequality, machine learning, (18 more...)

2311.09003

Country:

Europe > Greece > Attica > Athens (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Cheung, Tsun-Ming, Hatami, Hamed, Hatami, Pooya, Hosseini, Kaave

Online Learning and Disambiguations of Partial Concept Classes

arXiv.org Artificial Intelligence, Mar-30-2023

In a recent article, Alon, Hanneke, Holzman, and Moran (FOCS '21) introduced a unifying framework to study the learnability of classes of partial concepts. One of the central questions studied in their work is whether the learnability of a partial concept class is always inherited from the learnability of some "extension" of it to a total concept class. They showed this is not the case for PAC learning but left the problem open for the stronger notion of online learnability. We resolve this problem by constructing a class of partial concepts that is online learnable, but no extension of it to a class of total concepts is online learnable (or even PAC learnable).

artificial intelligence, machine learning, theorem 1, (18 more...)

2303.17578

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Ohio (0.04)
North America > United States > New York (0.04)
North America > United States > Indiana > Jackson County > Seymour (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.38)

Hariz, Kais, Kadri, Hachem, Ayache, Stéphane, Moakher, Maher, Artières, Thierry

Implicit Regularization with Polynomial Growth in Deep Tensor Factorization

arXiv.org Artificial Intelligence, Jul-25-2022

We study the implicit regularization effects of deep learning in tensor factorization. While implicit regularization in deep matrix and 'shallow' tensor factorization via linear and certain type of non-linear neural networks promotes low-rank solutions with at most quadratic growth, we show that its effect in deep tensor factorization grows polynomially with the depth of the network. This provides a remarkably faithful description of the observed experimental behaviour. Using numerical experiments, we demonstrate the benefits of this implicit regularization in yielding a more accurate estimation and better convergence properties.

Gunasekar et al. (2017) observed that for matrix factorization when there are no constraints on the rank, the solution of the optimization problem via gradient descent turns out to be a low-rank matrix. Furthermore, they conjectured that, with small enough learning rate and initialization, gradient descent on full-dimensional matrix factorization converges to the solution with minimal nuclear norm. Arora et al. (2019) and Razin & Cohen (2020) extended the analysis to deep matrix factorization and showed in this case that implicit regularization of gradient descent cannot be formulated as a norm-minimization problem. By studying the dynamics of gradient descent, they found theoretically and experimentally that it instead promotes sparsity of the singular values of the learned matrix, indicating that implicit regularization in deep learning has to be studied from a dynamical point of view. Moreover, Razin et al. (2021) studied implicit regularization in 'shallow' tensor

factorization, implicit regularization, polynomial growth, (13 more...)

2207.08942

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
Africa > Middle East > Tunisia > Tunis Governorate > Tunis (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)